Identifying ISI-indexed articles by their lexical usage: A text analysis approach

Authors

  • Mohammadreza Moohebat
  • Ram Gopal Raj
  • Sameem Abdul Kareem
  • Dirk Thorleuchter
Abstract

This research creates an architecture for investigating whether lexical differences exist between articles indexed by the Institute for Scientific Information (ISI) and non-ISI articles and, if such a difference is found, for proposing the best available classification method. Based on a collection of ISI- and non-ISI-indexed articles in business and computer science, three classification models are trained. A sensitivity analysis demonstrates the impact of words in different syntactical forms on the classification decision. The results show that the lexical domains of ISI and non-ISI articles are distinguishable by machine learning techniques. Our findings indicate that the support vector machine identifies ISI-indexed articles in both disciplines with higher precision than the Naïve Bayesian and K-Nearest Neighbors techniques.
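The classification setup the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the toy sentences, the labels, and every function name are invented here. Only two of the three classifier families (Naïve Bayes and k-nearest neighbors) are shown on bag-of-words features, and the support vector machine is left out for brevity.

```python
import math
from collections import Counter

# Toy, invented documents; the "isi" / "non_isi" labels are placeholders only.
TRAIN = [
    ("we propose a novel framework and evaluate it empirically", "isi"),
    ("results demonstrate significant improvement over the baseline", "isi"),
    ("this paper talks about some ideas on computers", "non_isi"),
    ("we discuss some general things about business", "non_isi"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real work would do more)."""
    return text.lower().split()


def train_naive_bayes(docs):
    """Collect per-class word counts, per-class document counts, and the vocabulary."""
    word_counts, class_counts, vocab = {}, Counter(), set()
    for text, label in docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return word_counts, class_counts, vocab


def predict_naive_bayes(model, text):
    """Multinomial Naive Bayes with add-one smoothing; unseen words are ignored."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_knn(docs, text, k=1):
    """Majority vote among the k training documents most similar to the query."""
    query = Counter(tokenize(text))
    ranked = sorted(
        docs, key=lambda d: cosine(query, Counter(tokenize(d[0]))), reverse=True
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision(predicted, actual, positive="isi"):
    """Share of documents predicted as `positive` that truly are `positive`."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    return tp / (tp + fp) if tp + fp else 0.0
```

With a labeled held-out set, calling `precision` on each classifier's predictions gives the kind of per-model comparison the abstract reports, though on real corpora rather than four toy sentences.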

Similar resources

Native and Non-native Use of Lexical Bundles in Discussion Section of Political Science Articles

The study of lexical bundles, among types of text analysis, has been gaining importance over the others in the last century. The present study employed a frequency-based analysis approach to the use of lexical bundles. The discussion sections of 60 political science articles, a corpus of around 253,063 words, were investigated in terms of three aspects of lexical bundles: structure, form, and function. The p...

Frequency and Types of Hedges and Boosters in Academic Medical and Teaching Research Articles

Background and aim: To gain in-depth insight and to achieve greater precision in the reasoning and writing of academic papers, the present study investigated the types and frequencies of hedges and boosters used in Iranian and foreign academic Medical and Teaching research articles (RAs). Material and methods: This descriptive-analytical study included 60 research articles. Iranian articles on ...

Published vs. Postgraduate Writing in Applied Linguistics: The Case of Lexical Bundles

Abstract: Lexical bundles, as building blocks of coherent discourse, have been the subject of much research in the last two decades. While many such studies have been mainly concerned with exploring variations in the use of these word sequences across different registers and disciplines, very few have addressed the use of some particular groups of lexical bundles within some gen...

Citation and Co-authorship Analysis of the Scientific Output of Iranian Researchers in Immunology in the ISI Database: A Short Report

Background: Currently, the share of scientific output, citations per paper, and co-authorship for articles indexed in databases such as ISI Web of Science are very important criteria for the evaluation and ranking of countries, researchers, institutes, articles, disciplines, and journals in the world. Therefore, the main objectives of the study were to determine co-authorship, the average citati...

A Corpus-based Study of Lexical Bundles in Discussion Section of Medical Research Articles

There has been increasing interest in utilizing corpora in linguistic research and pedagogy in recent years. Rhetorical organization of different sections of research articles may appear similar in various disciplines, but close examination may show subtle differences nonetheless. One of the features that has been at the center of attention especially in recent years is the idiomaticity of a di...

Journal title:
  • JASIST

Volume 66  Issue 

Pages  -

Publication date 2015